AITopics | decision threshold

decision threshold

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Threshold Learning for Optimal Decision Making

Nathan F. Lepora

Neural Information Processing Systems, May-1-2026, 05:45:59 GMT

Decision making under uncertainty is commonly modelled as a process of competitive stochastic evidence accumulation to threshold (the drift-diffusion model). However, it is unknown how animals learn these decision thresholds. We examine threshold learning by constructing a reward function that averages over many trials to Wald's cost function that defines decision optimality. These rewards are highly stochastic and hence challenging to optimize, which we address in two ways: first, a simple two-factor reward-modulated learning rule derived from Williams' REINFORCE method for neural networks; and second, Bayesian optimization of the reward function with a Gaussian process. Bayesian optimization converges in fewer trials than REINFORCE but is slower computationally with greater variance. The REINFORCE method is also a better model of acquisition behaviour in animals and a similar learning rule has been proposed for modelling basal ganglia function.
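The speed-accuracy trade-off that a decision threshold controls can be illustrated with a minimal Python sketch. This is not the paper's implementation: it assumes a discrete-time random walk with Gaussian noise as the evidence accumulator, and a per-trial reward of minus the error cost minus a time cost per step, as one plausible form of a reward that averages to a Wald-style cost. The names `run_trial`, `trial_reward`, `summarize` and all parameter values are hypothetical.

```python
import random


def run_trial(threshold, drift=0.1, noise=1.0, rng=random, max_steps=100_000):
    """Accumulate noisy evidence until it reaches +threshold or -threshold.

    Returns (correct, steps). The drift is positive, so reaching the
    positive boundary counts as the correct choice. Assumes threshold > 0.
    """
    evidence, steps = 0.0, 0
    while abs(evidence) < threshold and steps < max_steps:
        evidence += drift + noise * rng.gauss(0.0, 1.0)
        steps += 1
    return evidence >= threshold, steps


def trial_reward(correct, steps, error_cost=1.0, time_cost=0.01):
    """Assumed single-trial reward: penalize errors and elapsed time."""
    return -(0.0 if correct else error_cost) - time_cost * steps


def summarize(threshold, n_trials=2000, seed=0):
    """Return (accuracy, mean steps, mean reward) over simulated trials."""
    rng = random.Random(seed)
    n_correct, total_steps, total_reward = 0, 0, 0.0
    for _ in range(n_trials):
        correct, steps = run_trial(threshold, rng=rng)
        n_correct += correct
        total_steps += steps
        total_reward += trial_reward(correct, steps)
    return n_correct / n_trials, total_steps / n_trials, total_reward / n_trials


if __name__ == "__main__":
    for threshold in (1.0, 2.0, 5.0):
        accuracy, mean_steps, mean_reward = summarize(threshold)
        print(threshold, accuracy, mean_steps, mean_reward)
```

Raising the threshold buys accuracy at the price of longer decisions, and single-trial rewards are noisy, which is why the averaged reward is hard to optimize directly.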

artificial intelligence, machine learning, threshold, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Industry:

Education (0.85)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion

Neural Information Processing Systems, Feb-10-2026, 14:38:45 GMT

We present a case study on malware detection, a binary classification problem on byte sequences where classifier evasion is a well-established threat model.

classifier, data mining, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Oceania > Australia (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

meval: A Statistical Toolbox for Fine-Grained Model Performance Analysis

Sutariya, Dishantkumar, Petersen, Eike

arXiv.org Machine Learning, Dec-22-2025

Analyzing machine learning model performance stratified by patient and recording properties is becoming the accepted norm and often yields crucial insights about important model failure modes. Performing such analyses in a statistically rigorous manner is non-trivial, however. Appropriate performance metrics must be selected that allow for valid comparisons between groups of different sample sizes and base rates; metric uncertainty must be determined and multiple comparisons must be corrected for, in order to assess whether any observed differences may be purely due to chance; and in the case of intersectional analyses, mechanisms must be implemented to find the most "interesting" subgroups within combinatorially many subgroup combinations. We here present a statistical toolbox that addresses these challenges and enables practitioners to easily yet rigorously assess their models for potential subgroup performance disparities. While broadly applicable, the toolbox is specifically designed for medical imaging applications. The analyses provided by the toolbox are illustrated in two case studies, one in skin lesion malignancy classification on the ISIC2020 dataset and one in chest X-ray-based disease classification on the MIMIC-CXR dataset.

metric, statistical toolbox, subgroup, (12 more...)

arXiv.org Machine Learning

doi: 10.1007/978-3-032-05870-6_19

2512.17409

Country: Europe > Germany (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (0.90)
Health & Medicine > Therapeutic Area (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Cost-Aware Prediction (CAP): An LLM-Enhanced Machine Learning Pipeline and Decision Support System for Heart Failure Mortality Prediction

Yu, Yinan, Dippel, Falk, Lundberg, Christina E., Lindgren, Martin, Rosengren, Annika, Adiels, Martin, Sjöland, Helen

arXiv.org Artificial Intelligence, Nov-20-2025

Objective: Machine learning (ML) predictive models are often developed without considering downstream value trade-offs and clinical interpretability. This paper introduces a cost-aware prediction (CAP) framework that combines cost-benefit analysis assisted by large language model (LLM) agents to communicate the trade-offs involved in applying ML predictions. Materials and Methods: We developed an ML model predicting 1-year mortality in patients with heart failure (N = 30,021, 22% mortality) to identify those eligible for home care. We then introduced clinical impact projection (CIP) curves to visualize important cost dimensions (quality of life and healthcare provider expenses, further divided into treatment and error costs) to assess the clinical consequences of predictions. Finally, we used four LLM agents to generate patient-specific descriptions. The system was evaluated by clinicians for its decision support value. Results: The eXtreme gradient boosting (XGB) model achieved the best performance, with an area under the receiver operating characteristic curve (AUROC) of 0.804 (95% confidence interval (CI) 0.792-0.816), area under the precision-recall curve (AUPRC) of 0.529 (95% CI 0.502-0.558) and a Brier score of 0.135 (95% CI 0.130-0.140). Discussion: The CIP cost curves provided a population-level overview of cost composition across decision thresholds, whereas the LLM agents provided cost-benefit analysis at the individual patient level. The system was well received according to the evaluation by clinicians. However, feedback emphasized the need to strengthen the technical accuracy for speculative tasks. Conclusion: CAP utilizes LLM agents to integrate ML classifier outcomes and cost-benefit analysis for more transparent and interpretable decision support.
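The idea of viewing cost across decision thresholds can be sketched generically in Python. This is not the paper's CIP curve, which breaks cost into quality-of-life and provider dimensions; it is a plain cost-sensitive threshold sweep under assumed per-error costs. The function names, the toy labels and scores, and the cost values are all hypothetical.

```python
def total_cost(y_true, y_prob, threshold, cost_fp=1.0, cost_fn=5.0):
    """Sum assumed error costs when predicting positive for p >= threshold."""
    cost = 0.0
    for label, prob in zip(y_true, y_prob):
        predicted_positive = prob >= threshold
        if predicted_positive and label == 0:
            cost += cost_fp  # false positive
        elif not predicted_positive and label == 1:
            cost += cost_fn  # false negative
    return cost


def cost_curve(y_true, y_prob, thresholds, cost_fp=1.0, cost_fn=5.0):
    """Return (threshold, total cost) pairs for each candidate threshold."""
    return [
        (t, total_cost(y_true, y_prob, t, cost_fp, cost_fn)) for t in thresholds
    ]


def best_threshold(y_true, y_prob, thresholds, cost_fp=1.0, cost_fn=5.0):
    """Pick the candidate threshold with the lowest total cost."""
    curve = cost_curve(y_true, y_prob, thresholds, cost_fp, cost_fn)
    return min(curve, key=lambda pair: pair[1])


# Toy data for illustration only (not from the paper).
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
thresholds = [0.25, 0.5, 0.75]

if __name__ == "__main__":
    print(cost_curve(y_true, y_prob, thresholds))
    print(best_threshold(y_true, y_prob, thresholds))
```

On this toy data, a false negative costing five times a false positive moves the cheapest threshold from 0.5 (equal costs) down to 0.25, which is the kind of shift a cost curve makes visible.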

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.15357

Country: Europe > Sweden (0.30)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)

RS-Del: Edit Distance Robustness Certificates for Sequence Classifiers via Randomized Deletion

Neural Information Processing Systems, Oct-8-2025, 12:13:47 GMT

We present a case study on malware detection, a binary classification problem on byte sequences where classifier evasion is a well-established threat model.

certificate, classifier, robustness, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Oceania > Australia (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Critical appraisal of artificial intelligence for rare-event recognition: principles and pharmacovigilance case studies

Noren, G. Niklas, Meldau, Eva-Lisa, Ellenius, Johan

arXiv.org Artificial Intelligence, Oct-7-2025

Many high-stakes AI applications target low-prevalence events, where apparent accuracy can conceal limited real-world value. Relevant AI models range from expert-defined rules and traditional machine learning to generative LLMs constrained for classification. We outline key considerations for critical appraisal of AI in rare-event recognition, including problem framing and test set design, prevalence-aware statistical evaluation, robustness assessment, and integration into human workflows. In addition, we propose an approach to structured case-level examination (SCLE), to complement statistical performance evaluation, and a comprehensive checklist to guide procurement or development of AI models for rare-event recognition. We instantiate the framework in pharmacovigilance, drawing on three studies: rule-based retrieval of pregnancy-related reports; duplicate detection combining machine learning with probabilistic record linkage; and automated redaction of person names using an LLM. We highlight pitfalls specific to the rare-event setting, including optimism from unrealistic class balance and a lack of difficult positive controls in test sets, and show how cost-sensitive targets align model performance with operational value. While grounded in pharmacovigilance practice, the principles generalize to domains where positives are scarce and error costs may be asymmetric.
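Why unrealistic class balance breeds optimism can be shown with a short Python sketch of the standard Bayes relation between positive predictive value, sensitivity, specificity, and prevalence. It illustrates prevalence-aware evaluation in general rather than the paper's own procedure; the function name and the example numbers are hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Fraction of flagged cases that are true positives at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive predictions are possible with these inputs")
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Same classifier, evaluated at a balanced and at a rare-event prevalence.
    for prevalence in (0.5, 0.01):
        print(prevalence, positive_predictive_value(0.9, 0.9, prevalence))
```

With sensitivity and specificity both at 0.9, the value is 0.9 on a balanced test set but only about 0.083 at 1% prevalence, so most flagged cases would be false alarms in deployment.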

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2510.04341

Country: Europe (0.46)

Genre: Research Report (1.00)

Industry:

Information Technology (0.93)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)
(4 more...)

The Impact of Pseudo-Science in Financial Loans Risk Prediction

Scarone, Bruno, Baeza-Yates, Ricardo

arXiv.org Artificial Intelligence, Jul-25-2025

We study the societal impact of pseudo-scientific assumptions for predicting the behavior of people in a straightforward application of machine learning to risk prediction in financial lending. This use case also exemplifies the impact of survival bias in loan return prediction. We analyze the models in terms of their accuracy and social cost, showing that the socially optimal model may not imply a significant accuracy loss for this downstream task. Our results are verified for commonly used learning methods and datasets. Our findings also show that there is a natural dynamic when training models that suffer from survival bias, in which accuracy slightly deteriorates while recall and precision improve with time. These results act as an illusion, leading the observer to believe that the system is getting better, when in fact the model is suffering from increasing unfairness and survival bias.

artificial intelligence, dataset, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2507.16182

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.54)

Industry:

Banking & Finance > Loans (1.00)
Government (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Towards Improved Research Methodologies for Industrial AI: A case study of false call reduction

Pfab, Korbinian, Rothering, Marcel

arXiv.org Artificial Intelligence, Jun-18-2025

Are current artificial intelligence (AI) research methodologies ready to create successful, productive, and profitable AI applications? This work presents a case study on an industrial AI use case called false call reduction for automated optical inspection to demonstrate the shortcomings of current best practices. We identify seven weaknesses prevalent in related peer-reviewed work and experimentally show their consequences. We show that the best-practice methodology would fail for this use case. We argue amongst others for the necessity of requirement-aware metrics to ensure achieving business objectives, clear definitions of success criteria, and a thorough analysis of temporal dynamics in experimental datasets. Our work encourages researchers to critically assess their methodologies for more successful applied AI research.

The rise of automation in manufacturing has brought significant advancements to production processes. However, are current artificial intelligence (AI) research methodologies ready to create successful, productive, and profitable AI applications? Despite extensive research, the success of industrial AI applications has not kept pace with other industrial automation technologies due to methodological weaknesses. In this work, we address these methodological flaws using a case study on false call reduction in automated optical inspection (AOI) of printed circuit boards (PCBs). AOI systems, which use computer vision to inspect soldering quality, often produce a high number of false calls, that is, incorrect classifications of non-defective PCBs as defective. These false calls consume valuable human resources in manual inspection stages. Our study identifies seven prevalent weaknesses in related research on this topic and demonstrates their negative impacts experimentally.

artificial intelligence, dataset, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2506.14521

Country:

Europe (0.93)
North America > United States (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Anticipating Gaming to Incentivize Improvement: Guiding Agents in (Fair) Strategic Classification

Alhanouti, Sura, Naghizadeh, Parinaz

arXiv.org Artificial Intelligence, May-12-2025

While the use of ML-driven systems can enhance efficiency, it can also drive the humans who are subject to algorithmic decisions to adjust their behavior accordingly. Examples include Uber drivers coordinating their behavior in response to its surge pricing algorithm [Möhlmann and Zalmanson, 2017], applicants selecting keywords and formatting to pass automated resume screening [Forbes, 2022], and Facebook users adjusting their posting and content interaction choices in response to the platform's curation algorithms [Eslami et al., 2016]. These can be viewed as strategic responses by rational human subjects in these systems, motivating a game-theoretical analysis of learning algorithms with humans in the loop. Earlier works on the study of strategic humans facing ML systems largely focused on scenarios where users can strategically alter only their observable data (e.g., students cheating to obtain better test scores, job applicants making formatting or wording changes to their CV, or loan applicants opening several new accounts to increase their credit scores) to receive a favorable decision (e.g., be accepted to a school, job opening, or loan); see, e.g., [Hu et al., 2019, Milli et al., 2019]. This strategic behavior is referred to as strategic manipulation, where agents change their features without changing their true qualification states. This can be interpreted as cheating the machine learning algorithm: such agents may appear to be more qualified, without being truly suitable for a favorable outcome.

agent, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.05594

Country: North America > United States (0.45)

Genre: Research Report (0.64)

Industry:

Banking & Finance > Credit (0.34)
Education > Assessment & Standards > Student Performance (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
